89 research outputs found

    The mythical thermohaline oscillator?

    The system discussed by Stommel (1987) and Welander (1982), in which heating and evaporation at the surface of the ocean are balanced by vertical turbulent mixing, is studied analytically and numerically for mixing laws appropriate to salt fingers, rather than mechanical turbulence. Stommel and Welander found for mechanically-driven turbulent mixing that a limit cycle of T and S exists (that is, T and S oscillate) in the presence of steady forcing. We find that the usual salt finger parameterizations, in which the salinity flux coefficient and buoyancy flux ratio decrease with increasing density ratio, do not allow a limit cycle. This result holds whether the flux parameterization is for an interface using the “4/3 power law” laboratory relationships or in terms of vertical gradients. Rather, all initial conditions either evolve to a steady balance or lead to the upper layer becoming denser than the lower layer and overturning. In addition, we find that commonly used mechanical turbulence parameterizations for eddy diffusivity vs. Richardson number do not vary rapidly enough to allow a limit cycle in the Stommel/Welander model, although recent observations of equatorial turbulence do. Hence the possible existence of a limit cycle oscillation in evaporatively-driven areas of the ocean depends critically on the type of vertical mixing that occurs, and on the precise form of its parameterization.

    Persistent Kernels for Iterative Memory-bound GPU Applications

    Iterative memory-bound solvers commonly occur in HPC codes. Typical GPU implementations have a loop on the host side that invokes the GPU kernel once per time/algorithm step. The termination of each kernel implicitly acts as the barrier required after advancing the solution every time step. We propose a scheme for running memory-bound iterative GPU kernels: PERsistent KernelS (PERKS). In this scheme the time loop is moved inside a persistent kernel, and device-wide barriers are used for synchronization. We then reduce the traffic to device memory by caching a subset of the output in each time step in registers and shared memory to be used as input for the following time step. PERKS can be generalized to any iterative solver: they are largely independent of the solver's implementation. We explain the design principle of PERKS and demonstrate the effectiveness of PERKS for a wide range of iterative 2D/3D stencil benchmarks (geometric mean speedup of 2.29x in small domains and 1.53x in large domains), and a Krylov subspace solver (geometric mean speedup of 4.67x in smaller SpMV datasets from SuiteSparse and 1.39x in larger SpMV datasets, for conjugate gradient).
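
The caching idea in this abstract can be illustrated with a back-of-the-envelope traffic count. The sketch below is not the PERKS implementation: it is a simplified model that assumes every step reads and writes each uncached cell exactly once, ignores halo exchange and cache effects, and uses hypothetical function names (`host_loop_traffic`, `persistent_traffic`).

```python
def host_loop_traffic(n_cells, steps):
    """Elements moved to/from device memory when every time step is a
    separate kernel launch: each step reads the whole domain and writes
    it back (simplifying assumption: one read and one write per cell)."""
    return steps * 2 * n_cells


def persistent_traffic(n_cells, steps, cached_cells):
    """Same count when the time loop runs inside one persistent kernel
    and `cached_cells` elements stay in registers/shared memory between
    steps (assumption: cached cells touch device memory only for the
    initial load and the final store)."""
    cached = min(cached_cells, n_cells)
    uncached = n_cells - cached
    # One load and one store for the cached part, plus a read and a
    # write per step for everything that does not fit on chip.
    return 2 * cached + steps * 2 * uncached


if __name__ == "__main__":
    n, steps = 1000, 100
    for cached in (0, 250, 1000):
        ratio = host_loop_traffic(n, steps) / persistent_traffic(n, steps, cached)
        print(f"cached={cached:4d}  traffic reduction = {ratio:.2f}x")
```

Under these assumptions the reduction grows with the fraction of the domain that fits on chip, which is at least consistent with the abstract reporting larger speedups for small domains than for large ones. The model says nothing about actual run time.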

    Exploiting Scratchpad Memory for Deep Temporal Blocking: A case study for 2D Jacobian 5-point iterative stencil kernel (j2d5pt)

    General Purpose Graphics Processing Units (GPGPU) are used in most of the top systems in HPC. The total capacity of scratchpad memory has increased by more than 40 times in the last decade. However, existing optimizations for stencil computations using temporal blocking have not aggressively exploited the large capacity of scratchpad memory. This work uses the 2D Jacobian 5-point iterative stencil as a case study to investigate the use of large scratchpad memory. Unlike existing research that tiles the domain in a thread block fashion, we tile the domain so that each tile is large enough to utilize all available scratchpad memory on the GPU. Consequently, we process several time steps inside a single tile before offloading the result back to global memory. Our evaluation shows that our performance is comparable to state-of-the-art implementations, yet our implementation is much simpler and does not require auto-generation of code. Comment: This short paper is published in the 15th Workshop on General Purpose Processing Using GPU (GPGPU 2023).
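
Processing several time steps inside one tile can be sketched on the CPU in plain Python. The code below is a generic overlapped temporal-blocking scheme, not the paper's GPU implementation: the 0.2 averaging weights, the fixed (Dirichlet) boundary, and the names `jacobi_step`, `reference` and `temporally_blocked` are assumptions made for illustration. Each tile is copied together with a halo as wide as the number of fused steps, advanced locally, and only its interior is written back.

```python
def jacobi_step(u):
    """One 5-point Jacobi sweep; the outer ring of `u` is held fixed."""
    n, m = len(u), len(u[0])
    new = [row[:] for row in u]
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            new[i][j] = 0.2 * (u[i][j] + u[i - 1][j] + u[i + 1][j]
                               + u[i][j - 1] + u[i][j + 1])
    return new


def reference(u, steps):
    """Plain time loop: one full-grid sweep per step."""
    for _ in range(steps):
        u = jacobi_step(u)
    return u


def temporally_blocked(u, steps, tile):
    """Advance `steps` time steps tile by tile.

    Each tile is loaded once with a halo of width `steps` (clamped to
    the grid), stepped locally, and its interior is stored once. Stale
    values at an artificial tile edge creep inward by one cell per
    step, so a halo of `steps` cells keeps the tile interior exact.
    """
    n, m = len(u), len(u[0])
    out = [row[:] for row in u]
    for r0 in range(1, n - 1, tile):
        for c0 in range(1, m - 1, tile):
            r1, c1 = min(r0 + tile, n - 1), min(c0 + tile, m - 1)
            er0, er1 = max(r0 - steps, 0), min(r1 + steps, n)
            ec0, ec1 = max(c0 - steps, 0), min(c1 + steps, m)
            # "Load" the tile plus halo into local storage.
            local = [row[ec0:ec1] for row in u[er0:er1]]
            for _ in range(steps):
                local = jacobi_step(local)
            # "Store" only the tile interior back to the full grid.
            for i in range(r0, r1):
                for j in range(c0, c1):
                    out[i][j] = local[i - er0][j - ec0]
    return out


if __name__ == "__main__":
    grid = [[float((7 * i + 3 * j) % 11) for j in range(10)] for i in range(12)]
    a = reference(grid, 3)
    b = temporally_blocked(grid, 3, 4)
    print(max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)))
```

The trade-off is redundant work in the halo in exchange for one load and one store per tile instead of one per time step; larger tiles, as the abstract argues for, shrink the halo's relative cost.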

    Tunable violet radiation in a quasi-phase-matched periodically poled stoichiometric lithium tantalate waveguide by direct femtosecond laser writing

    We report on violet-light generation using femtosecond-laser written waveguides in periodically poled MgO:LiTaO3 crystal under conditions of third-order quasi-phase matching. Ten parallel depressed cladding waveguides are successfully fabricated with different grating periods in the same sample with fan-out χ(2) grating structures. These waveguides exhibit high optical quality with minimum insertion loss as low as 0.71 dB. Temperature- and wavelength-tuned second harmonic generation for different waveguides is demonstrated by using a tunable CW Ti:sapphire laser. Tunable violet second harmonic light has been generated with a single period over the range of 396 nm to 401 nm by varying the crystal temperature from 60 °C to 200 °C. At the quasi-phase matching temperature, 0.37 mW of violet light power at 397.2 nm is generated for a fundamental power of 336.7 mW, corresponding to a normalized conversion efficiency of 0.39%/(W·cm²). Our work contributes to designing tunable and efficient on-chip violet light sources based on femtosecond-laser written waveguides. This work was supported by the National Natural Science Foundation of China (Nos. 11874239 and 61775120); Major Program of Shandong Province Natural Science Foundation (Grant No. ZR2018ZB0649); National Key Research and Development Project (No. SQ2019YFA070063-01); MINECO (FIS2017-87970-R); and Ministerio de Economía y Competitividad de España (MAT2016-75362-C3-1-R).

    Second harmonic generation of femtosecond laser written depressed cladding waveguides in periodically poled MgO:LiTaO3 crystal

    We report on the fabrication of depressed cladding waveguides in periodically poled MgO-doped LiTaO3 by using low-repetition-rate femtosecond laser writing, and their use for guided-wave second harmonic generation (SHG). The cladding waveguides exhibit different guiding performance along the extraordinary and ordinary polarizations. Temperature-dependent quasi-phase-matching (QPM) is realized to obtain SHG in the depressed cladding waveguides. The results show that the QPM temperature depends on the poling period and on the features of the cladding waveguides. The highest nonlinear conversion efficiency (0.74% W⁻¹cm⁻²) was found in the waveguide fabricated with large scanning velocity (0.75 mm/s) and small radius (15 μm). National Natural Science Foundation of China (NSFC) (61775120, 11874239); Junta de Castilla y León (Project SA046U16); Spanish Ministerio de Economía y Competitividad (MINECO, FIS2013-44174-P, FIS2015-71933-REDT).